Sockets vs RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck
Authors
Abstract
The compute requirements associated with the TCP/IP protocol suite have been previously studied by a number of researchers. However, recently developed 10-Gigabit networks such as 10-Gigabit Ethernet and InfiniBand have added a new dimension of complexity to this problem: memory traffic. While previous studies have pointed out the implications of the memory traffic bottleneck, to the best of our knowledge, none has shown the actual impact of the memory accesses generated by TCP/IP on 10-Gigabit networks. In this paper, we perform an in-depth evaluation of various aspects of the TCP/IP protocol suite, including its memory traffic and CPU requirements, and compare these with RDMA-capable network adapters, using 10-Gigabit Ethernet and InfiniBand as example networks. Our measurements show that while the host-based TCP/IP stack has a high CPU requirement, up to about 80% of this overhead is associated with the core protocol implementation, especially for large messages, and is potentially offloadable using the recently proposed TCP Offload Engines. However, the host-based TCP/IP stack also requires multiple transactions of data over today's moderately fast memory buses (up to a factor of four in some cases); that is, for 10-Gigabit networks it generates enough memory traffic to saturate a typical memory bus while utilizing less than 35% of the peak network bandwidth. In contrast, we show that the RDMA interface requires up to four times less memory traffic and has an almost zero CPU requirement at the data sink. These measurements demonstrate the potential impact of providing an RDMA interface over IP on 10-Gigabit networks.
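To make the abstract's bandwidth argument concrete, the short sketch below computes the memory-bus traffic implied by a given number of bus crossings per network byte. It is a minimal illustration, not an artifact of the paper: the per-byte crossing counts and the choice of a 64-bit DDR-266 bus are assumptions; only the 10 Gbps link rate, the 35% utilization figure, and the factor-of-four traffic ratio are taken from the abstract above.

/* Back-of-the-envelope sketch (not from the paper): estimates the memory-bus
   traffic implied by host-based TCP/IP versus RDMA on the receive side.
   The crossing counts and the memory-bus bandwidth below are illustrative
   assumptions. */
#include <stdio.h>

int main(void) {
    const double net_gbps       = 10.0; /* link rate of a 10-Gigabit network           */
    const double utilization    = 0.35; /* fraction of peak network bandwidth achieved */
    const double tcp_crossings  = 4.0;  /* assumed bus crossings per byte for TCP/IP:
                                           NIC DMA into the kernel buffer, the copy out
                                           to the user buffer, plus associated
                                           cache-miss/write-back traffic               */
    const double rdma_crossings = 1.0;  /* assumed: the NIC DMAs directly into the
                                           registered application buffer               */
    const double bus_gbps       = 17.0; /* illustrative peak: a 64-bit DDR-266 memory
                                           bus moves about 2.1 GB/s, i.e. ~17 Gbps     */

    double achieved = net_gbps * utilization;    /* Gbps delivered on the wire   */
    double tcp_mem  = achieved * tcp_crossings;  /* Gbps crossing the memory bus */
    double rdma_mem = achieved * rdma_crossings;

    printf("achieved network bandwidth: %4.1f Gbps\n", achieved);
    printf("TCP/IP memory traffic     : %4.1f Gbps (%.0f%% of the bus peak)\n",
           tcp_mem, 100.0 * tcp_mem / bus_gbps);
    printf("RDMA memory traffic       : %4.1f Gbps (%.0f%% of the bus peak)\n",
           rdma_mem, 100.0 * rdma_mem / bus_gbps);
    return 0;
}

With these assumed figures, 3.5 Gbps of delivered bandwidth already implies about 14 Gbps of memory traffic; since sustained memory bandwidth sits well below the theoretical peak, this is effectively bus saturation at roughly a third of the network's capacity, while the single-crossing RDMA path leaves most of the bus available to the application.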
Similar Resources
Performance Evaluation of RDMA over IP: A Case Study with the Ammasso Gigabit Ethernet NIC
Remote Direct Memory Access (RDMA) has been proposed to overcome the limitations of traditional send/receive based communication protocols like sockets. The immense potential of RDMA to improve the communication performance while being extremely conservative on resource requirements has made RDMA the most sought after feature in current and next generation networks. Recently, there are many act...
Performance Evaluation of RDMA over IP: A Case Study with Ammasso Gigabit Ethernet NIC
Remote Direct Memory Access (RDMA) has been proposed to overcome the limitations of traditional send/receive based communication protocols. The immense potential of RDMA to improve the communication performance while being extremely conservative on resource requirements has made RDMA the most sought after feature in current and next generation networks. Recently, there are many active efforts ...
Evaluation of RDMA Over Ethernet Technology for Building Cost Effective Linux Clusters
Remote Direct Memory Access (RDMA) is an effective technology for reducing system load and improving performance. Recently, Ethernet offerings that exploit RDMA technology have become available that can potentially provide a high-performance fabric for MPI communications at lower cost than other competing technologies. The goal of this paper is to evaluate RDMA over gigabit Ethernet (ROE) as a ...
Low-Latency Linux Drivers for Ethernet over High-Speed Networks
Nowadays, high computing demands are often tackled by clusters of single computers, each of which is basically an assembly of a growing number of CPU cores and main memory, also called a node; these nodes are connected by some kind of communication network. With the growing speed and number of CPU cores, the network becomes a severe bottleneck limiting overall cluster performance. High-speed int...
Comparative Performance Analysis of RDMA-Enhanced Ethernet
Since the advent of high-performance distributed computing, system designers and end-users have been challenged with identifying and exploiting a communications infrastructure that is optimal for a diverse mix of applications in terms of performance, scalability, cost, wiring complexity, protocol maturity, versatility, etc. Today, the span of interconnect options for a cluster typically ranges ...
Journal:
Volume / Issue:
Pages: -
Publication year: 2004